Initial Observations on Query Based Sampling in Distributed CLIR

Authors

  • Xiao Mang Shou
  • Mark Sanderson
Abstract

Cross Language Information Retrieval (CLIR) enables people to search for information written in languages different from that of their queries. Information can be retrieved either from a single cross-lingual collection or from a variety of distributed cross-lingual sources. This paper presents initial results exploring the effectiveness of distributed CLIR using query-based sampling techniques, which to the best of our knowledge has not been investigated before. In distributed retrieval with multiple databases, query-based sampling provides a simple and effective way of acquiring accurate resource descriptions, which help to select which databases to search. Observations from our initial experiments show that the negative impact of query-based sampling on cross-language search may not be as great as it is on monolingual retrieval.
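
As a rough illustration of the general idea of query-based sampling mentioned in the abstract, the sketch below probes a database with single-term queries, collects the returned documents, and builds a resource description (term to document frequency) from that sample. It is a toy sketch under stated assumptions, not the authors' experimental setup: the functions `search`, `query_based_sample`, and `rank_databases`, the in-memory list of strings standing in for a database, the parameter values, and the simple document-frequency score used for database selection are all hypothetical choices made for illustration. The sketch is monolingual; it does not model query translation.

```python
import random
from collections import Counter


def search(docs, term, k=4):
    """Toy stand-in for a database search interface (assumption):
    return up to k documents that contain the query term."""
    hits = [d for d in docs if term in d.split()]
    return hits[:k]


def query_based_sample(docs, seed_term, max_docs=10, max_queries=50, rng=None):
    """Sample a database through its search interface only.

    Returns (description, seen), where description maps each term to the
    number of sampled documents containing it, and seen is the sample."""
    rng = rng or random.Random(0)
    description = Counter()
    seen = set()
    term = seed_term
    for _ in range(max_queries):
        for doc in search(docs, term):
            if doc not in seen:
                seen.add(doc)
                # Count each term once per document (document frequency).
                description.update(set(doc.split()))
        if len(seen) >= max_docs or not description:
            break
        # Next probe query: a term drawn from what has been learned so far.
        term = rng.choice(sorted(description))
    return description, seen


def rank_databases(descriptions, query_terms):
    """Hypothetical selection rule: rank databases by the summed document
    frequency of the query terms in each sampled description."""
    scores = {
        name: sum(desc[t] for t in query_terms)
        for name, desc in descriptions.items()
    }
    return sorted(scores, key=scores.get, reverse=True)
```

A caller would sample each database once, keep the resulting descriptions, and pass them with the terms of an incoming query to `rank_databases` to decide which databases to search first.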

Similar resources

CLIR as Query Expansion as Logical Inference

Cross-language IR (CLIR) has usually been described as two separate steps: query translation and query evaluation. The uncertainty of the first step is not always integrated in the calculation of the correspondence between a document and the original query. In this paper, we try to develop a unified framework in which both steps are integrated. It will turn out that CLIR is a special case of qu...

On Bidirectional English-Arabic Search

In Cross-Language Information Retrieval (CLIR), queries in one language retrieve relevant documents in other languages. Machine-Readable Dictionaries (MRD) and Machine Translation (MT) systems are important resources for query translation in CLIR. We investigate the use of MT systems and MRD for Arabic-English and English-Arabic CLIR. The translation ambiguity associated with these resources is ...

Simple Query Translation Methods for Korean-English and Korean-Chinese CLIR in NTCIR Experiments

The main goal of our participation in the NTCIR workshop is to evaluate relatively simple yet practical methods for CLIR using Korean queries and English and Chinese documents. We employed dictionary-based query translation methods for both cases but with different translation ambiguity resolution techniques. The Korean-English CLIR was quite successful, but the Korean-Chinese CLIR resulted in unexpectedly poor performance. While our analysis is still in progress, w...

Improving Performance Of English-Hindi Cross Language Information Retrieval Using Transliteration Of Query Terms

The main issue in Cross Language Information Retrieval (CLIR) is its poor retrieval performance, in terms of average precision, compared with monolingual retrieval. The main reasons behind this poor performance are mismatched query terms, lexical ambiguity, and untranslated query terms. These existing problems of CLIR need to be addressed in order to increase the perf...

Translation-based ranking in cross-language information retrieval

Today’s volume of user-generated, multilingual textual data creates a need for information processing systems in which cross-linguality, i.e., the ability to work on more than one language, is fully integrated into the underlying models. In the particular context of Information Retrieval (IR), this amounts to ranking and retrieving relevant documents from a large repository in language A, given...

Publication date: 2006